Applying Randomness Effectively Based on Random Forests for Classification Task of Datasets of Insufficient Information

نویسنده

  • Hyontai Sug
چکیده

Random forests are known to be good for data mining of classification tasks, because random forests are robust for datasets having insufficient information possibly with some errors. But applying random forests blindly may not produce good results, and a dataset in the domain of rotogravure printing is one of such datasets. Hence, in this paper, some best classification accuracy based on clever application of random forests to predict the occurrence of cylinder bands in rotogravure printing is investigated. Since random forests could generate good results with an appropriate combination of parameters like the number of randomly selected attributes for each split and the number of trees in the forests, an effective data mining procedure considering the property of the target dataset by way of trial random forests is investigated. The effectiveness of the suggested procedure is shown by experiments with very good results.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Cluster-Based Image Segmentation Using Fuzzy Markov Random Field

Image segmentation is an important task in image processing and computer vision which attract many researchers attention. There are a couple of information sets pixels in an image: statistical and structural information which refer to the feature value of pixel data and local correlation of pixel data, respectively. Markov random field (MRF) is a tool for modeling statistical and structural inf...

متن کامل

Breast Cancer Diagnosis from Perspective of Class Imbalance

Introduction: Breast cancer is the second cause of mortality among women. Early detection is the only rescue to reduce the risk of breast cancer mortality. Traditional methods cannot effectively diagnose tumor since they are based on the assumption of well-balanced dataset.. However, a hybrid method can help to alleviate the two-class imbalance problem existing in the ...

متن کامل

A Random Forest Classifier based on Genetic Algorithm for Cardiovascular Diseases Diagnosis (RESEARCH NOTE)

Machine learning-based classification techniques provide support for the decision making process in the field of healthcare, especially in disease diagnosis, prognosis and screening. Healthcare datasets are voluminous in nature and their high dimensionality problem comprises in terms of slower learning rate and higher computational cost. Feature selection is expected to deal with the high dimen...

متن کامل

A New Document Embedding Method for News Classification

Abstract- Text classification is one of the main tasks of natural language processing (NLP). In this task, documents are classified into pre-defined categories. There is lots of news spreading on the web. A text classifier can categorize news automatically and this facilitates and accelerates access to the news. The first step in text classification is to represent documents in a suitable way t...

متن کامل

Mental Arithmetic Task Recognition Using Effective Connectivity and Hierarchical Feature Selection From EEG Signals

Introduction: Mental arithmetic analysis based on Electroencephalogram (EEG) signal for monitoring the state of the user’s brain functioning can be helpful for understanding some psychological disorders such as attention deficit hyperactivity disorder, autism spectrum disorder, or dyscalculia where the difficulty in learning or understanding the arithmetic exists. Most mental arithmetic recogni...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • J. Applied Mathematics

دوره 2012  شماره 

صفحات  -

تاریخ انتشار 2012